Abstract


We consider the problem of learning a hypergraph using edge-detecting queries. In this model, 
the learner may query whether a set of vertices induces an edge of the hidden hypergraph or not. 
We show that an r-uniform hypergraph with m edges and n vertices is learnable with
$O(2^{4r} m \cdot \mathrm{poly}(r, \log n))$ queries with high probability. The queries can be made in
$O(\min(2^r (\log m + r)^2, (\log m + r)^3))$ rounds. We also give an algorithm that learns an almost
uniform hypergraph of dimension r using $O(2^{O((1+\Delta/2) r)} \cdot m^{1+\Delta/2} \cdot \mathrm{poly}(\log n))$
queries with high probability, where $\Delta$ is the difference between the maximum and the minimum edge
sizes. This upper bound matches our lower bound of $\Omega((\frac{m}{1+\Delta/2})^{1+\Delta/2})$ for this class of
hypergraphs in terms of dependence on m. The queries can also be made in
$O((1 + \Delta) \cdot \min(2^r (\log m + r)^2, (\log m + r)^3))$ rounds.


Keywords: query learning, hypergraph, multiple round algorithm, sampling, chemical reaction network


1. Introduction 


A hypergraph H = (V,E) is given by a set of vertices V and a set of edges E, which is a subset of 
the power set of V ($E \subseteq 2^V$). The dimension of a hypergraph H is the cardinality of the largest set
in E. H is said to be r-uniform if E contains only sets of size r. In this paper, we are interested in 
learning a hidden hypergraph using edge-detecting queries of the following form 


$Q_H(S)$: does S include at least one edge of H?


where $S \subseteq V$. The query $Q_H(S)$ is answered 1 or 0, indicating whether S contains all vertices of at
least one edge of H or not. We abbreviate $Q_H(S)$ to Q(S) whenever the choice of H is clear from
the context. This type of query may be motivated by the following scenario. We are given a set of
chemicals, in which some groups of chemicals react and others don't. When multiple chemicals are
combined in one test tube, a reaction is detectable if and only if at least one group of chemicals in
the tube reacts.
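
The query model can be made concrete with a small simulation. The following is a minimal Python sketch, not taken from the paper: the helper name make_oracle and the three-edge example hypergraph are illustrative assumptions, and the learner is only ever given the returned function Q, never the hidden edge list.

```python
from typing import Iterable


def make_oracle(edges: Iterable[Iterable[int]]):
    """Return Q, where Q(S) is 1 if S contains all vertices of some hidden edge, else 0."""
    hidden = [frozenset(e) for e in edges]

    def Q(S: Iterable[int]) -> int:
        s = set(S)
        return int(any(e <= s for e in hidden))

    return Q


# Hypothetical reaction network: chemicals 1..6 and three reacting groups.
Q = make_oracle([{1, 2, 3}, {4, 5, 6}, {2, 4, 5}])
print(Q({1, 2, 3, 6}))  # 1: the tube contains the whole group {1, 2, 3}
print(Q({1, 2, 4, 6}))  # 0: no group is completely present
```

The oracle is monotone: adding vertices to S can only turn an answer from 0 to 1, which is the property the later search procedures rely on.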

Considerable effort, for example, Grebinski and Kucherov (1998), Beigel et al. (2001), Alon 
et al. (2004), Angluin and Chen (2004), and Alon and Asodi (2005), has been devoted to the case 
when the underlying reaction network is a graph, that is, chemicals react in pairs. Among them, 
Grebinski and Kucherov (1998), Beigel et al. (2001) and Alon et al. (2004) study the case when 
the underlying networks are Hamiltonian cycles or matchings, which have specific applications to 
genome sequencing. In this application, DNA sequences are aligned according to the reactions that 
involve the two ends of pairs of DNA sequences in certain experimental settings. The reaction graph 
can be characterized as either a Hamiltonian cycle or path (if you consider each DNA sequence as a 
vertex) or a matching (if you consider each end of a DNA sequence as a vertex). Implementations of 
some of these algorithms are in practical use. Grebinski and Kucherov (2000) also study a somewhat 
different and interesting query model, which they call the additive model, where instead of giving a 
1 or 0 answer, a query tells you the total number of edges contained in a certain vertex set. 

Angluin and Chen (2004) generalize the problem of learning with edge-detecting queries to 
general reaction graphs and show that general graphs are efficiently learnable. In this work, we 
consider a more general problem when the chemicals react in groups of size more than two, that 
is, the underlying reaction network is a hypergraph. In Angluin and Chen (2004), they give an 
adaptive algorithm which takes O(logn) queries per edge, where n is the number of vertices. This 
is nearly optimal as we can easily show using an information-theoretic argument. For the problem 
of learning hypergraphs of bounded dimension and a given number of edges, a similar information- 
theoretic argument gives a lower bound that is linear in the number of edges. However, the lower 
bound is not achievable. It is shown in Angluin and Chen (2004) that $\Omega((2m/r)^{r/2})$ edge-detecting
queries are required to learn a general hypergraph of dimension r with m edges. At the heart of the
construction of Angluin and Chen (2004), edges of size 2 are deliberately arranged to hide an edge
of size r. The discrepancy in sizes of different coexisting edges is the main barrier for the learner.
However, this lower bound does not preclude efficient algorithms for classes of hypergraphs whose
edge sizes are close. In particular, the question whether there is a learning algorithm for uniform
hypergraphs using a number of queries that is linear in the number of edges is still left open, which 
is the main subject of this paper. 

In this paper, we are able to answer this question affirmatively. Let n be the number of vertices 
and m be the number of edges in the hypergraph. We show that an r-uniform hypergraph is learnable 
with $O(2^{4r} m \cdot \mathrm{poly}(r, \log n, \log \frac{1}{\delta}))$ queries with probability at least $1 - \delta$.

We also obtain results for learning the class of hypergraphs that is almost uniform. Formally 
speaking, 


Definition 1 A hypergraph is $(r, \Delta)$-uniform, where $\Delta < r$, if its dimension is r and the difference
between its maximum and minimum edge sizes is $\Delta$, or equivalently, the maximum and the minimum
edge sizes are r and $r - \Delta$ respectively.


The class of hypergraphs used in the construction of the lower bound in Angluin and Chen (2004)
is in fact $(r, r-2)$-uniform. Therefore, they show that $\Omega((2m/r)^{r/2})$ edge-detecting queries are
required to learn an $(r, r-2)$-uniform hypergraph. Based on this result, we show by a simple
reduction that $\Omega((\frac{m}{1+\Delta/2})^{1+\Delta/2})$ queries are required to learn the class of $(r, \Delta)$-uniform
hypergraphs. On the other hand, we extend the algorithm that learns uniform hypergraphs to learning
the class of $(r, \Delta)$-uniform hypergraphs with
$O(2^{O((1+\Delta/2) r)} \cdot m^{1+\Delta/2} \cdot \mathrm{poly}(\log n, \log \frac{1}{\delta}))$ queries with probability at
least $1 - \delta$. The upper bound and lower bound have the same dependence on m.


Another important issue studied in the literature is the parallelism of algorithms. Since the 
queries are motivated by an experiment design scenario, it is desirable that experiments can be 
conducted in parallel. Alon et al. (2004) and Alon and Asodi (2005) give lower and upper bounds for 
1-round algorithms for certain types of graphs. Beigel et al. (2001) describe an 8-round algorithm 
for learning a matching. Angluin and Chen (2004) give a 5-round algorithm for learning a general 
graph. In this paper, we show that in our algorithm for r-uniform hypergraphs, queries can be made 
in $O(\min(2^r (\log m + r)^2, (\log m + r)^3))$ rounds, and in our algorithm for $(r, \Delta)$-uniform hypergraphs,
queries can be made in $O((1 + \Delta) \cdot \min(2^r (\log m + r)^2, (\log m + r)^3))$ rounds.


In the paper, we also introduce an interesting combinatorial object, which we call an indepen- 
dent covering family. Basically, an independent covering family of a hypergraph is a collection 
of independent sets that cover all non-edges. An interesting observation is that the set of negative 
queries of any algorithm that learns a hypergraph drawn from a class of hypergraphs that is closed 
under the operation of adding an edge is an independent covering family of that hypergraph. Note
that both the class of r-uniform hypergraphs and the class of $(r, \Delta)$-uniform hypergraphs are closed
under the operation of adding an edge. This implies that the query complexity of learning such a
hypergraph is bounded below by the minimum size of its independent covering families. In the 
opposite direction, we give subroutines to find one arbitrary edge from a hypergraph. With the help 
of the subroutines, we show that if we can construct small-sized independent covering families for 
some class of hypergraphs, we are able to obtain an efficient learning algorithm for it. In this paper, 
we give a randomized construction of an independent covering family of size $O(r 2^{2r} m \log n)$ for
r-uniform hypergraphs with m edges. This yields a learning algorithm using a number of queries 
that is quadratic in m, which is further improved to give an algorithm using a number of queries that 
is linear in m. 


As mentioned in Angluin and Chen (2004) and some other papers, the hypergraph learning 
problem may also be viewed as the problem of learning a monotone disjunctive normal form (DNF) 
boolean formula using membership queries only. Each vertex of H is represented by a variable and 
each edge by a term containing all variables associated with the vertices of the edge. A membership 
query assigns 1 or 0 to each variable, and is answered 1 if the assignment satisfies at least one term, 
and 0 otherwise, that is, if the set of vertices corresponding to the variables assigned 1 contains all 
vertices of at least one edge of H. An r-uniform hypergraph corresponds to a monotone r-DNF. An 
$(r, \Delta)$-uniform hypergraph corresponds to a monotone DNF whose terms are of sizes in the range
$[r - \Delta, r]$. Thus, our results apply also to learning the corresponding classes of monotone DNF
formulas using membership queries. 
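
The correspondence between the two query types can be checked exhaustively on a small instance. The sketch below is illustrative only: the function names and the 3-uniform example on vertices 0..5 are assumptions, not part of the paper. It evaluates the monotone DNF on every assignment and compares the answer with the edge-detecting query on the set of variables assigned 1.

```python
from itertools import product


def membership_query(terms, assignment):
    """Monotone DNF value: 1 iff some term has all of its variables set to 1."""
    return int(any(all(assignment[v] for v in term) for term in terms))


def edge_detecting_query(edges, S):
    """Q_H(S): 1 iff S contains all vertices of at least one edge."""
    return int(any(set(e) <= S for e in edges))


# Hypothetical 3-uniform hypergraph on vertices 0..5, read as a monotone 3-DNF.
EDGES = [(0, 1, 2), (3, 4, 5), (1, 3, 4)]
N = 6


def answers_agree():
    """Compare both query types on all 2^N assignments."""
    for bits in product((0, 1), repeat=N):
        S = {v for v in range(N) if bits[v]}
        if membership_query(EDGES, bits) != edge_detecting_query(EDGES, S):
            return False
    return True


print(answers_agree())  # True
```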


The paper is organized as follows. In Section 3, we formally define the concept of an inde- 
pendent covering family and give a randomized construction of independent covering families for 
general r-uniform hypergraphs. In Section 4, we show how to efficiently find an arbitrary edge in 
a hypergraph and give a simple learning algorithm using a number of queries that is quadratic in 
the number of edges. In Section 5, we give an algorithm that learns r-uniform hypergraphs using 
a number of queries that is linear in the number of edges. Then we derive a lower bound for almost
uniform hypergraphs in Section 6. Finally, we show how to learn the class of $(r, \Delta)$-uniform
hypergraphs in Section 7. 
2. Preliminaries


Let H = (V,E) be a hypergraph. In this paper, we assume that edges do not contain each other, 
as there is no way to detect the existence of edges that contain other edges using edge-detecting 
queries. A subset of V is an independent set of H if it contains no edge of H. We use the term 
non-edge to denote any set that is a candidate edge in some class of hypergraphs but is not an edge 
in the target hypergraph. For example, in an r-uniform hypergraph, any r-set that is not an edge is 
a non-edge. In an $(r, \Delta)$-uniform hypergraph, any set of size in the range $[r - \Delta, r]$ that is not an
edge is a non-edge. The degree of a set $\chi \subseteq V$ in a hypergraph H, denoted $d_H(\chi)$, is the number of
edges of H that contain $\chi$. In particular, $d_H(\emptyset) = |E|$ is the number of all edges in H.
Throughout the paper, we omit the ceiling and floor signs whenever they are not crucial. 


3. An Independent Covering Family 


Definition 2 An independent covering family of a hypergraph H is a collection of independent sets 
of H such that every non-edge not containing an edge is contained in one of these independent sets. 


When H is a uniform hypergraph, the above only requires that every non-edge is contained in 
one of the independent sets in the independent covering family. An example is shown below. 


Example 1 Let V = [1,6]. Let H = (V,{{1,2,3},{4,5,6} ,{2,4,5}}) be a 3-uniform hypergraph. 
F = {{1,2,4,6}, {1,2,5,6}, {1,3,4,5}, {1,3,4,6}, {2,3,4,6}, {2,3,5,6}} 


is an independent covering family of H. As we can easily verify, all sets in F are independent sets, 
and every triple except {1,2,3}, {4,5,6}, {2,4,5} is contained in some set in F. 
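
Example 1 can be verified mechanically. The following Python sketch is an illustration rather than anything from the paper, and the function names are assumptions. It checks both conditions of Definition 2 for the uniform case: every set in F is independent, and every non-edge triple lies inside some set of F.

```python
from itertools import combinations

V = range(1, 7)
E = [{1, 2, 3}, {4, 5, 6}, {2, 4, 5}]
F = [{1, 2, 4, 6}, {1, 2, 5, 6}, {1, 3, 4, 5}, {1, 3, 4, 6}, {2, 3, 4, 6}, {2, 3, 5, 6}]


def is_independent(S, edges):
    """A set is independent if it contains no edge."""
    return not any(e <= S for e in edges)


def is_independent_covering_family(family, vertices, edges, r):
    """Check Definition 2 for an r-uniform hypergraph."""
    if not all(is_independent(S, edges) for S in family):
        return False
    non_edges = [set(t) for t in combinations(vertices, r) if set(t) not in edges]
    return all(any(z <= S for S in family) for z in non_edges)


print(is_independent_covering_family(F, V, E, 3))       # True
print(is_independent_covering_family(F[:-1], V, E, 3))  # False: {2, 3, 5} is left uncovered
```

Dropping the last set of F breaks the covering property, which shows that the family in the example has no slack in that position.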


The concept of independent covering families is central in this paper. This can be appreciated 
from two aspects. 

First, we observe that if the target hypergraph is drawn from a class of hypergraphs that is 
closed under the operation of adding an edge (e.g., the class of all r-uniform hypergraphs), the 
set of negative queries of any algorithm that learns it is an independent covering family of this 
hypergraph. This is because if there is a non-edge not contained in any of the sets on which these 
negative queries are made, we will not be able to distinguish between the target hypergraph and the 
hypergraph with this non-edge being an extra edge. Therefore, the minimum size of independent 
covering families bounds the query complexity from below. Furthermore, any learning algorithm 
gives a construction of an independent covering family of the target hypergraph. Therefore, in order 
to learn the hypergraph, we have to be able to construct an independent covering family for it. 

Second, although the task of constructing an independent covering family seems substantially 
easier than that of learning, since the hypergraph is known in the construction task, we show that 
efficient construction of small-sized independent covering families yields an efficient learning algo- 
rithm. In Section 4, we will show how to find an arbitrary edge out of a hypergraph of dimension 
r using O(rlogn) queries. Imagine a simple algorithm in which at each iteration we maintain a 
sub-hypergraph of the target hypergraph which contains edges that we have found, and construct 
an independent covering family for it and ask queries on all the sets in the family. If there is a set 
whose query is answered positively, we can find at least one edge out of this set. The edge must
be a new edge as the set is an independent set of the sub-hypergraph that we have found. We repeat
this process until we have collected all the edges in the target hypergraph, in which case the
independent covering family we construct is a proof of this fact. Suppose that we can construct an
independent covering family of size at most f(m) for any hypergraph with at most m edges drawn
from a certain class of hypergraphs. The above algorithm learns this class of hypergraphs using only
$O(f(m) \cdot m \cdot r \log n)$ queries.

In the rest of this section, we give a randomized construction of a linear-sized (linear in the 
number of edges) independent covering family of an r-uniform hypergraph which succeeds with 
probability at least 1/2. By the standard probabilistic argument, the construction proves the existence 
of an independent covering family of size linear in the number of edges for any uniform hypergraph. 
This construction leads to a quadratic algorithm described in Section 4, and is also a central part of 
our main algorithm given in Section 5. 

Our main theorem in this section is as follows. 


Theorem 3 Any r-uniform hypergraph with m edges has an independent covering family of size 
$O(r 2^{2r} m \log n)$.


Before giving the construction, we introduce some notation and definitions. We call a vertex set
$\chi \subseteq V$ relevant if it is contained in at least one edge in the hypergraph. Similarly, a vertex is relevant
if it is contained in at least one edge in the hypergraph. Let $p_H(\chi) = 1/(2^{r+1} d_H(\chi))^{1/(r-|\chi|)}$, where
$\chi$ is a relevant vertex set. We will call $p_H(\chi)$ the discovery probability of $\chi$, as this is a probability
that will help in discovering edges containing $\chi$ in our learning algorithms.


Definition 4 A $(\chi, p)$-sample is a random set of vertices that contains $\chi$ and contains each other
vertex independently with probability p.


We will abbreviate $(\chi, p)$-sample as $\chi$-sample when the choice of p is clear or not important in the
context. 

In the construction, we draw $(\chi, p_H(\chi))$-samples independently for each relevant set $\chi$. Each
$(\chi, p_H(\chi))$-sample deals only with non-edges that contain $\chi$. Let us take a look at the probability
that a $(\chi, p_H(\chi))$-sample $P_\chi$ covers some non-edge $z \supseteq \chi$ while excluding all edges. Due to our
choice of $p_H(\chi)$,

$$\Pr[z \subseteq P_\chi] = p_H(\chi)^{r-|\chi|} = \frac{1}{2^{r+1} d_H(\chi)}.$$

Therefore, if we draw $2^{r+1} d_H(\chi)$ many $\chi$-samples, the probability that z is contained in at least one
$\chi$-sample is $\Omega(1)$. However, such a $\chi$-sample is not necessarily an independent set. Especially when
z contains a high-degree subset $\chi'$, it is likely that such a $\chi$-sample contains an edge that contains $\chi'$.
But since we will also draw $(\chi', p_H(\chi'))$-samples, it is reasonable to hope that a $(\chi', p_H(\chi'))$-sample
has a better chance of success in dealing with z. In fact, in our construction, we show that the set of
$\chi$-samples, where $\chi \subseteq z$ has the minimum discovery probability among all relevant subsets of z, has
an independent set that contains z with probability at least 1/2.

A construction of an independent covering family is given in Algorithm 1, which succeeds with 
probability at least 1/2 as shown by Lemma 5. 


Lemma 5 $F_H$ (constructed in Algorithm 1) contains an independent covering family of H with
probability at least 1/2.

Algorithm 1 Construction of an independent covering family
1: $F_H \leftarrow$ a set containing $4(\ln 2 + r \ln n) \cdot 2^r d_H(\chi)$ $(\chi, p_H(\chi))$-samples drawn independently for
   every relevant set $\chi$.
2: Output the independent sets contained in $F_H$ as an independent covering family.





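Proof Suppose z is a non-edge and $\chi$ is a subset of z with the minimum discovery probability. Let
$P_\chi$ be a $\chi$-sample. As argued before,

$$\Pr[z \subseteq P_\chi] = \frac{1}{2^{r+1} d_H(\chi)}.$$

Since $\chi$ has the minimum discovery probability, the degree of any subset $\chi' \subseteq z$ is at most
$1/(2^{r+1} p_H(\chi)^{r-|\chi'|})$. By the union bound,

$$\Pr[P_\chi \text{ is independent} \mid z \subseteq P_\chi] \ge 1 - \sum_{\chi' \subseteq z} d_H(\chi')\, p_H(\chi)^{r-|\chi'|}
\ge 1 - \sum_{\chi' \subseteq z} \frac{1}{2^{r+1} p_H(\chi)^{r-|\chi'|}}\, p_H(\chi)^{r-|\chi'|} = \frac{1}{2}.$$

The probability that a $\chi$-sample contains z and is independent is at least $1/(2^{r+2} d_H(\chi))$. Therefore,
the probability that such a $\chi$-sample exists in $F_H$ is at least

$$1 - \left(1 - \frac{1}{2^{r+2} d_H(\chi)}\right)^{4(\ln 2 + r \ln n) \cdot 2^r d_H(\chi)} \ge 1 - e^{-(r \ln n + \ln 2)} = 1 - \frac{1}{2 n^r}.$$

Thus, the probability that every non-edge is contained in some negative sample in $F_H$ is at least
$1 - \binom{n}{r}/(2 n^r) \ge 1/2$. □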


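Theorem 3 is then established by the fact that the size of $F_H$ is bounded by
$\sum_{\chi} 4(\ln 2 + r \ln n) \cdot 2^r d_H(\chi) = O(r 2^{2r} m \log n)$, where the sum ranges over the relevant sets
and $\sum_{\chi} d_H(\chi) = 2^r m$ because each of the m edges has $2^r$ subsets.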
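
The randomized construction can be tried out on a small instance. The following Python sketch is a hedged illustration of Algorithm 1 for the r-uniform case, not the authors' implementation. The function names, the fixed random seed, and the choice to skip full edges as relevant sets (a full edge is contained in no non-edge of a uniform hypergraph) are assumptions made here.

```python
import math
import random
from itertools import combinations


def relevant_degrees(edges, r):
    """d_H(chi) for every proper subset chi of an edge (including the empty set)."""
    degree = {}
    for e in edges:
        for k in range(r):  # sizes 0..r-1; full edges are skipped (assumption)
            for chi in combinations(sorted(e), k):
                key = frozenset(chi)
                degree[key] = degree.get(key, 0) + 1
    return degree


def discovery_probability(d, r, size):
    """p_H(chi) = (1 / (2^(r+1) * d_H(chi)))^(1 / (r - |chi|))."""
    return (1.0 / (2 ** (r + 1) * d)) ** (1.0 / (r - size))


def build_family(vertices, edges, r, rng):
    """Draw chi-samples as in Algorithm 1 and keep those that are independent sets."""
    edges = [frozenset(e) for e in edges]
    n = len(vertices)
    family = set()
    for chi, d in relevant_degrees(edges, r).items():
        p = discovery_probability(d, r, len(chi))
        count = math.ceil(4 * (math.log(2) + r * math.log(n)) * 2 ** r * d)
        for _ in range(count):
            sample = chi | {v for v in vertices if v not in chi and rng.random() < p}
            if not any(e <= sample for e in edges):
                family.add(frozenset(sample))
    return family


def covers_all_non_edges(family, vertices, edges, r):
    """True if every r-set that is not an edge lies inside some member of the family."""
    edges = [frozenset(e) for e in edges]
    non_edges = (frozenset(t) for t in combinations(vertices, r) if frozenset(t) not in edges)
    return all(any(z <= S for S in family) for z in non_edges)


V = list(range(1, 7))
E = [{1, 2, 3}, {4, 5, 6}, {2, 4, 5}]
F = build_family(V, E, 3, random.Random(0))
print(len(F), covers_all_non_edges(F, V, E, 3))
```

Duplicate samples are merged, so the returned family can be much smaller than the number of samples drawn; the bound in Theorem 3 counts the draws.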


4. A Simple Quadratic Algorithm 


In this section, we first give an algorithm that finds an arbitrary edge in a hypergraph of dimension
r using only $r \log n$ edge-detecting queries. The algorithm is adaptive and takes $r \log n$ rounds. The
success probability in the construction of independent covering families in the previous section can
be easily improved by drawing more samples. Using the high-probability version of the construction,
we obtain an algorithm using a number of queries that is quadratic in m that learns an r-uniform
hypergraph with m edges with high probability. Although the first algorithm for finding one edge is
deterministic and simple, the round complexity $r \log n$ might be too high when n is much larger than
m. We then improve the round complexity to $O(\log m + r)$ using only $O(\log m \log n)$ more queries.
The improved algorithm is randomized and succeeds with high probability.

4.1 Find One Edge


We start with a simpler task, finding just one relevant vertex in the hypergraph. The algorithm 
FIND-ONE-VERTEX is shown in Algorithm 2. 


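Algorithm 2 FIND-ONE-VERTEX
1: $S \leftarrow V$, $A \leftarrow V$.
2: while $|A| > 1$ do
3:   Divide A arbitrarily into $A_0$ and $A_1$, such that $|A_0| = \lceil |A|/2 \rceil$, $|A_1| = \lfloor |A|/2 \rfloor$.
4:   if $Q(S \setminus A_0) = 0$ then
5:     $A \leftarrow A_0$.
6:   else
7:     $A \leftarrow A_1$, $S \leftarrow S \setminus A_0$.
8:   end if
9: end while
10: Output the element in A.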





Lemma 6 FIND-ONE-VERTEX finds one relevant vertex in a non-empty hypergraph with n ver- 
tices using at most logn edge-detecting queries. 


Proof First we show that the following equalities hold for each iteration (see Figure 1). 


Q(S) = 1,Q(S\A) = 0. 





Figure 1: An illustration of FIND-ONE-VERTEX (region labels: Q(S\A) = 0 and Q(S) = 1)


These equalities guarantee that A contains at least one relevant vertex. Since we assume that 
the hypergraph is non-empty, the above equalities clearly hold for our initial assignment of S and A. 
Let’s assume Q(S) = 1 and Q(S\A) = 0 at the beginning of an iteration. There are two cases: 


1. $Q(S \setminus A_0) = 0$: clearly the equalities hold for S and $A_0$.


2. $Q(S \setminus A_0) = 1$: since $Q((S \setminus A_0) \setminus A_1) = Q(S \setminus (A_0 \cup A_1)) = Q(S \setminus A) = 0$, the equalities hold for
$S \setminus A_0$ and $A_1$.


Since the size of A halves at each iteration, after at most $\log n$ iterations, A has exactly one relevant
vertex. The algorithm takes at most $\log n$ edge-detecting queries in total, as it makes one query
in each iteration. □
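
The binary search of Algorithm 2 translates almost line by line into code. The sketch below is an illustrative Python rendering, with assumptions stated here: vertices are sortable integers, A is split by position in sorted order (the algorithm allows any split), and the oracle Q and example hypergraph are hypothetical.

```python
import math


def find_one_vertex(V, Q):
    """Binary search for a relevant vertex, assuming Q(V) == 1.

    Returns (v, S, queries) where Q(S) == 1 and Q(S - {v}) == 0.
    """
    S = set(V)
    A = sorted(V)  # a list, so that it can be cut into two halves
    queries = 0
    while len(A) > 1:
        half = (len(A) + 1) // 2          # |A0| = ceil(|A| / 2)
        A0, A1 = A[:half], A[half:]
        queries += 1
        if Q(S - set(A0)) == 0:
            A = A0                        # every edge inside S meets A0
        else:
            A = A1                        # S \ A0 still contains an edge
            S = S - set(A0)
    return A[0], S, queries


EDGES = [{1, 2, 3}, {4, 5, 6}, {2, 4, 5}]   # hypothetical hidden hypergraph
Q = lambda T: int(any(e <= set(T) for e in EDGES))

v, S, used = find_one_vertex(range(1, 7), Q)
print(v, sorted(S), used)  # 4 [4, 5, 6] 3
```

On this instance the search uses 3 queries, matching the bound of log n rounded up for n = 6.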
Using FIND-ONE-VERTEX as a subroutine, FIND-ONE-EDGE (Algorithm 3) is a recursive
algorithm that finds one edge from a non-empty hypergraph, which is not necessarily uniform.
Note that knowledge of r is not required in FIND-ONE-EDGE. It is included in the description of the
algorithm for the purpose of explanation.


Algorithm 3 FIND-ONE-EDGE
1: Let $r > 0$ be the dimension of the hypergraph.
2: Call FIND-ONE-VERTEX and let v be the found vertex.
3: Make a query on $\{v\}$.
4: if the query is answered 1 then
5:   Output $\{v\}$.
6: else
7:   FIND-ONE-VERTEX also computes a set S such that $Q(S) = 1$ and $Q(S \setminus \{v\}) = 0$. That is,
     S contains only edges incident with v.
8:   Call FIND-ONE-EDGE on the hypergraph induced on S with the vertex v removed. The
     hypergraph is of dimension at most $r - 1$. Let e be the found edge.
9:   Output the edge $e \cup \{v\}$.
10: end if








Edge-detecting queries for recursive calls of FIND-ONE-EDGE can be simulated recursively. 
To make an edge-detecting query for a next-level recursive call of FIND-ONE-EDGE, we just need 
to make an edge-detecting query at the current level on the union of a subset of S and {v}. In fact, 
each time, we make edge-detecting queries on the union of a subset of S and the set of vertices 
already found. 

In FIND-ONE-EDGE, because S contains only edges incident with v, $e \cup \{v\}$ is an edge in the
hypergraph. This establishes its correctness. The following lemma shows that it uses only $r \log n$
queries.


Lemma 7 FIND-ONE-EDGE finds one edge in a non-empty hypergraph of dimension r with n 
vertices using rlogn edge-detecting queries. 


Proof When r = 1, the problem is exactly that of finding one relevant vertex and hence solvable using
$\log n$ queries. It is evident that if FIND-ONE-EDGE uses $(r-1) \log n$ queries for a hypergraph
with dimension $r - 1$, then it only uses $(r-1) \log n + \log n = r \log n$ queries for a hypergraph with
dimension r. □
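
The recursion of FIND-ONE-EDGE, including the way queries for recursive calls are simulated, can be sketched as follows. This is an illustrative Python version written as a loop instead of a recursion; the names, the sorted-order split, and the example oracle are assumptions made here, not details from the paper.

```python
def find_one_vertex(V, Q):
    """Binary search of Algorithm 2; returns (v, S) with Q(S) = 1 and Q(S - {v}) = 0."""
    S, A = set(V), sorted(V)
    while len(A) > 1:
        half = (len(A) + 1) // 2
        A0, A1 = A[:half], A[half:]
        if Q(S - set(A0)) == 0:
            A = A0
        else:
            A, S = A1, S - set(A0)
    return A[0], S


def find_one_edge(V, Q):
    """Iterative form of Algorithm 3, assuming Q(V) == 1."""
    found = set()          # vertices of the edge discovered so far
    universe = set(V)      # vertex set of the current sub-hypergraph
    while True:
        # A query of the recursive call is simulated by adding the found vertices.
        v, S = find_one_vertex(universe, lambda T: Q(set(T) | found))
        found.add(v)
        if Q(found) == 1:
            return frozenset(found)
        universe = S - {v}


EDGES = [{1, 2, 3}, {4, 5, 6}, {2, 4, 5}]   # hypothetical hidden hypergraph
calls = []


def Q(T):
    """Edge-detecting oracle that also records every query made."""
    calls.append(frozenset(T))
    return int(any(e <= set(T) for e in EDGES))


edge = find_one_edge(range(1, 7), Q)
print(sorted(edge), len(calls))  # [4, 5, 6] 7
```

The query count here includes the extra query on the found vertices at each level, so it is slightly larger than the sum of the binary-search queries alone.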


4.2 A Quadratic Algorithm 


With the help of FIND-ONE-EDGE, we give the first learning algorithm for r-uniform hypergraphs. 
A sketch of the algorithm has been described in Section 3. Let H = (V,E) be the sub-hypergraph 
the algorithm has found so far. Algorithm 4 learns a uniform hypergraph with probability at least
$1 - \delta$. We will specify $\delta'$ later.

Algorithm 4 The quadratic algorithm
1: $e \leftarrow$ FIND-ONE-EDGE(V). $E \leftarrow \{e\}$.
2: repeat
3:   $F_H \leftarrow$ $4(\ln \frac{1}{\delta'} + r \ln n) \cdot 2^r d_H(\chi)$ $(\chi, p_H(\chi))$-samples drawn independently for every relevant
     set $\chi$ in H.
4:   Make queries on sets of $F_H$ that are independent in H.
5:   Call FIND-ONE-EDGE on one positive sample if there exists any. Let e be the edge found.
     $E \leftarrow E \cup \{e\}$.
6: until no new edge found

In the algorithm we draw $4(\ln \frac{1}{\delta'} + r \ln n) \cdot 2^r d_H(\chi)$ $\chi$-samples. Using essentially the same
argument as in Section 3, we can guarantee that $F_H$ contains an independent covering family with
probability at least $1 - \delta'$. Algorithm 4 finds one new edge at each iteration because $F_H$ is an
independent covering family of the already found sub-hypergraph H. Thus, it ends after at most m
iterations. If we choose $\delta' = \delta/m$, it will succeed with probability at least $1 - \delta$. As knowledge
of m is not assumed, we will choose $\delta' = \delta/\binom{n}{r} \le \delta/m$. The query complexity will be
$O(2^{2r} m^2 \cdot \mathrm{poly}(r, \log n) \cdot \log \frac{1}{\delta})$, which is quadratic in m.
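
Putting the pieces together, the quadratic algorithm can be sketched in a few dozen lines. The Python code below is a hedged illustration, not the authors' implementation. It assumes an r-uniform target with at least one edge, skips full edges as relevant sets, and, unlike Algorithm 4, queries samples one at a time and stops at the first positive one instead of querying a whole family in parallel. All function names and the seed are assumptions.

```python
import math
import random
from itertools import combinations


def find_one_vertex(V, Q):
    """Binary search of Algorithm 2; returns (v, S)."""
    S, A = set(V), sorted(V)
    while len(A) > 1:
        half = (len(A) + 1) // 2
        A0, A1 = A[:half], A[half:]
        if Q(S - set(A0)) == 0:
            A = A0
        else:
            A, S = A1, S - set(A0)
    return A[0], S


def find_one_edge(V, Q):
    """Iterative form of Algorithm 3, assuming Q(V) == 1."""
    found, universe = set(), set(V)
    while True:
        v, S = find_one_vertex(universe, lambda T: Q(set(T) | found))
        found.add(v)
        if Q(found) == 1:
            return frozenset(found)
        universe = S - {v}


def relevant_degrees(edges, r):
    """Degree of every proper subset of a found edge in the found sub-hypergraph."""
    degree = {}
    for e in edges:
        for k in range(r):
            for chi in combinations(sorted(e), k):
                degree[frozenset(chi)] = degree.get(frozenset(chi), 0) + 1
    return degree


def learn_quadratic(V, Q, r, delta, rng):
    """Sketch of Algorithm 4 for an r-uniform target with at least one edge."""
    V = sorted(V)
    n = len(V)
    delta_prime = delta / math.comb(n, r)
    E = {find_one_edge(V, Q)}
    while True:
        new_edge = None
        for chi, d in relevant_degrees(E, r).items():
            p = (1.0 / (2 ** (r + 1) * d)) ** (1.0 / (r - len(chi)))
            count = math.ceil(4 * (math.log(1 / delta_prime) + r * math.log(n)) * 2 ** r * d)
            for _ in range(count):
                P = chi | {v for v in V if v not in chi and rng.random() < p}
                if any(e <= P for e in E):
                    continue            # not independent in the found sub-hypergraph
                if Q(P) == 1:           # positive sample: it hides a new edge
                    new_edge = find_one_edge(P, Q)
                    break
            if new_edge is not None:
                break
        if new_edge is None:
            return E                    # no positive sample: stop
        E.add(new_edge)


HIDDEN = [frozenset(s) for s in ({1, 2, 3}, {4, 5, 6}, {2, 4, 5})]
Q = lambda T: int(any(e <= set(T) for e in HIDDEN))
learned = learn_quadratic(range(1, 7), Q, 3, 0.1, random.Random(1))
print(sorted(sorted(e) for e in learned))
```

Each pass of the outer loop rebuilds the samples from scratch for the edges found so far, which is the source of the quadratic dependence on m.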


4.3 An Improved FIND-ONE-EDGE 


Despite the simplicity of FIND-ONE-EDGE, its queries have to be made in rlogn rounds. When 
irrelevant vertices abound, that is, when n is much larger than m, we would like to arrange queries in 
a smaller number of rounds. In the following, we use a technique developed in Damaschke (1998) 
(for learning monotone boolean functions) to find one edge from a non-empty hypergraph with high 
probability using only $O(\log m + r)$ rounds and $O((\log m + r) \log n)$ queries. However, the algorithm
is more involved. 

The new algorithm is also based on FIND-ONE-VERTEX. The process of FIND-ONE-VERTEX
can be viewed as a binary decision tree. At each internal node, a set A is split and a decision on 
which branch to follow is made based on query results. The FIND-ONE-VERTEX algorithm does 
not restrict how we split the set A as long as we divide it into halves. In the new algorithm, we will 
pre-determine the way A’s will be divided at the very beginning of the algorithm. 


Let us index each vertex by a distinct binary number $b_1 b_2 \ldots b_{\log n}$. Each split is based on a
certain bit. When we say that we split a set A according to its $i$-th ($i \in [1, \log n]$) bit, we mean that
we divide A into two sets, one containing vertices whose $i$-th bits are 0 and the other containing
vertices whose $i$-th bits are 1. We will denote these two sets $A|_{b_i=0}$ and $A|_{b_i=1}$ respectively. If we
split $A|_{b_i=0}$ further according to the $j$-th bit, we get another two sets $(A|_{b_i=0})|_{b_j=0}$ and
$(A|_{b_i=0})|_{b_j=1}$. We will abbreviate these two sets as $A|_{b_i=0,b_j=0}$ and $A|_{b_i=0,b_j=1}$. In general, let s
be a partial assignment that assigns some bits to 0 or 1; we use $A|_s$ to denote the set of vertices in A
that match the assignments of bits in s.
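
The restriction notation can be illustrated with a few lines of code. In this Python sketch, vertices are the integers 0..n-1 and bit 1 is taken to be the least significant bit; both choices, as well as the function names, are assumptions made for illustration, since the text does not fix a bit order.

```python
def bit(v, i):
    """The i-th bit (1-based, least significant first) of the index of vertex v."""
    return (v >> (i - 1)) & 1


def restrict(A, s):
    """A|_s: the vertices of A whose bits match the partial assignment s (dict: bit -> 0/1)."""
    return {v for v in A if all(bit(v, i) == value for i, value in s.items())}


A = set(range(8))
print(sorted(restrict(A, {1: 0})))         # [0, 2, 4, 6]
print(sorted(restrict(A, {1: 0, 2: 1})))   # [2, 6]
```

Restricting in two steps gives the same set as restricting by the combined assignment, which is what the abbreviated notation expresses.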

Using this notation and our splitting scheme, at each iteration of FIND-ONE-VERTEX, A is
equal to $V|_s$ for some partial assignment s, and $A_0$ and $A_1$ are equal to $A|_{b_i=0}$ and $A|_{b_i=1}$ if we
split A according to the $i$-th bit. One of the key ideas in Damaschke (1998) is that because the
splits are pre-determined, and the queries are monotone in terms of subset relation, we can make
queries on pre-determined splits to make predictions. The idea will be made clear in the rest of
the section. PARA-FIND-ONE-VERTEX (Algorithm 5) improves the round complexity of
FIND-ONE-VERTEX.
Algorithm 5 PARA-FIND-ONE-VERTEX
1: $S \leftarrow V$, $A \leftarrow V$, $I \leftarrow [1, \log n]$.
2: while $|A| > 1$ do
3:   $\forall i \in I$, make queries on $(S \setminus A) \cup A|_{b_i=0}$ and $(S \setminus A) \cup A|_{b_i=1}$.
4:   Let $R_i = (Q((S \setminus A) \cup A|_{b_i=0}), Q((S \setminus A) \cup A|_{b_i=1}))$ be the query results for $i \in I$.
5:   case 1: $\exists i \in I$ such that $R_i = (0,0)$
6:     $A \leftarrow A|_{b_i=0}$, $I \leftarrow I \setminus \{i\}$.
7:   case 2: $\exists i \in I$ such that $R_i = (1,1)$
8:     Choose a from $\{0,1\}$ uniformly at random.
9:     $A \leftarrow A|_{b_i=a}$, $S \leftarrow (S \setminus A) \cup A|_{b_i=a}$, $I \leftarrow I \setminus \{i\}$.
10:  case 3: $\forall i \in I$, $R_i = (1,0)$ or $R_i = (0,1)$
11:    Swap the indices of vertices so that $R_i = (1,0)$ for every $i \in I$. (If $R_i = (0,1)$, we flip the
       $i$-th bit of all the indices, that is, for every vertex, if the $i$-th bit of its index is 0, we set the $i$-th
       bit to 1 and vice versa.)
12:    $\forall i \in I$, let $A^i = A|_{\forall j \in I, j \le i, b_j=0}$ and make a query on $S^i = (S \setminus A) \cup A^i$.
13:    Let $i^* = \min\{i \mid Q(S^i) = 0\}$ if it exists and the largest index in I otherwise. Let
       $j^* = \max\{j \mid j < i^*, j \in I\}$.
14:    $I \leftarrow \{i \mid i > i^*, i \in I\}$.
15:    if all queries are answered 1 then
16:      $A \leftarrow A^{i^*}$, $S \leftarrow S^{i^*}$ ($i^*$ is the largest index in I in this case).
17:    else
18:      $A \leftarrow A^{j^*}$, $S \leftarrow S^{j^*}$.
19:    end if
20: end while
21: Output the element in A.





In PARA-FIND-ONE-VERTEX, the equalities $Q(S) = 1$, $Q(S \setminus A) = 0$ are also preserved at all
times, which establishes the correctness. We first make queries on $(S \setminus A) \cup A|_{b_i=0}$ ($= S \setminus A|_{b_i=1}$) and
$(S \setminus A) \cup A|_{b_i=1}$ ($= S \setminus A|_{b_i=0}$) for every i. There are 3 possible query outcomes.


case 1: If there exists i such that $R_i = (0,0)$, that is, both queries are answered 0, all edges contained
in S are split between $A|_{b_i=0}$ and $A|_{b_i=1}$, that is, the intersections of each edge with these two
sets are a partition of the edge. We call this case an edge-splitting event. The iterations at
which an edge-splitting event happens are edge-splitting iterations. Since we then set A to be
$A|_{b_i=0}$, the intersection of A with any edge contained in S becomes strictly smaller. Because
we will only shrink A in other cases, the intersections will never increase. Thus, there are at
most $r - 1$ edge-splitting iterations as each edge is of size at most r.


case 2: If there exists i such that $R_i = (1,1)$, that is, both queries are answered 1, we can set S to
be either of the two sets $(S \setminus A) \cup A|_{b_i=0}$ and $(S \setminus A) \cup A|_{b_i=1}$ as they both contain edges, and
set A to be $A|_{b_i=0}$ or $A|_{b_i=1}$ respectively. The equalities $Q(S) = 1$, $Q(S \setminus A) = 0$ are preserved
in this case. However, we would like to choose whichever of the two sets $(S \setminus A) \cup A|_{b_i=0}$
and $(S \setminus A) \cup A|_{b_i=1}$ contains fewer edges. Because they do not share a common edge as their
intersection $S \setminus A$ does not contain an edge, the sum of the numbers of edges contained in
these two sets is at most the number of edges contained in S. If we choose the set with fewer
edges, we will cut the number of edges contained in S in half. With a random choice, this
happens with probability 1/2. We will call this case an edge-separating event and call the
corresponding iteration an edge-separating iteration.


case 3: If neither of the two events happens, we need to deal with the third case where, for every
$i \in I$, one of the queries is answered 0 and the other is answered 1. In this case, for convenience of
exposition, we will flip the indices of all vertices, so that $R_i = (1,0)$ for every $i \in I$. Thus,
$\forall i \in I$, $Q((S \setminus A) \cup A|_{b_i=0}) = 1$. In this case, we can set A to $A|_{b_i=0}$ for some $i \in I$, and the
equalities $Q(S) = 1$, $Q(S \setminus A) = 0$ are preserved. However, this won't help us to reduce the
round complexity as it only cuts A in half.


Consider the next split. We shall divide $A|_{b_i=0}$ further into $A|_{b_i=0,b_j=0}$ and $A|_{b_i=0,b_j=1}$ for
some $j \in I$, $j \ne i$. Since we already know that $Q((S \setminus A) \cup A|_{b_j=1}) = 0$, the fact that $A|_{b_i=0,b_j=1}$
is a subset of $A|_{b_j=1}$ implies $Q((S \setminus A) \cup A|_{b_i=0,b_j=1}) = 0$. Therefore, we only need to know
$Q((S \setminus A) \cup A|_{b_i=0,b_j=0})$.


(a) If it is 1, we can set A to be $A|_{b_i=0,b_j=0}$ and continue.


(b) Otherwise, it is 0, and an edge-splitting event takes place.


In PARA-FIND-ONE-VERTEX, we choose the indices we use to split A in the increasing
order of indices in I and make queries on $S^i = (S \setminus A) \cup A^i$ for every $i \in I$ all in parallel (recall
that $A^i = A|_{\forall j \in I, j \le i, b_j=0}$). If all queries are answered 1, $i^*$ is the largest index in I and $A^{i^*}$ is
a singleton set containing a relevant vertex. Otherwise, we get an edge-splitting event, since
$S^{j^*} = (S \setminus A) \cup A^{j^*}$ contains edges, but both $(S \setminus A) \cup A^{j^*}|_{b_{i^*}=0}$ and $(S \setminus A) \cup A^{j^*}|_{b_{i^*}=1}$ don't (note
that $j^*$ is the index right before $i^*$ in the increasing order of indices in I and $A^{i^*} = A^{j^*}|_{b_{i^*}=0}$). In
this case, it can be verified that our updates to A and S in the third case preserve the equalities
$Q(S) = 1$, $Q(S \setminus A) = 0$.


By the above analysis, the first case and the third case both result in an edge-splitting event or
succeed in finding a relevant vertex. There are at most r such iterations. The second case results in
an edge-separating event, in which with probability 1/2 we will cut the number of edges contained
in S in half. We can show that in expectation there are $\log m$ edge-separating events. Therefore,
there are $\log m + r$ iterations in expectation. At each iteration, we use at most $3 \log n$ queries which
are made in at most 2 rounds. Therefore,


Lemma 8 In expectation, PARA-FIND-ONE-VERTEX finds one relevant vertex using
$O((\log m + r) \log n)$ queries, and the queries can be made in $2(\log m + r)$ rounds.


PARA-FIND-ONE-VERTEX can work with FIND-ONE-EDGE to find an edge using expected
$O(r(\log m + r) \log n)$ queries in expected $2r(\log m + r)$ rounds. In fact, we can improve the round
complexity further to $2(\log m + r)$ based on two observations, both of which use the fact that in the
whole process we only shrink S.

The first observation is that edges removed from S in the edge-separating events will not be
considered again. Therefore, $\log m$ bounds not only the expected number of edge-separating
iterations of PARA-FIND-ONE-VERTEX, but also that of the whole process.

The second observation is that the edge-splitting events can be remembered and reused when
we try to find the next relevant vertex. Since we only shrink S, the bits that split all edges in S will
continue to do so. Let $I^*$ be the set of edge-splitting indices. In the new vertex finding process,
instead of starting with A = V = S (recall that in a recursive call of FIND-ONE-EDGE, we look for a
relevant edge contained in S; therefore, in the recursive call, V is equal to the S we obtain in
the previous call), we start with $A = S|_{\forall i \in I^*, b_i=0}$. Note that the equalities $Q(S) = 1$, $Q(S \setminus A) = 0$ are
preserved. This helps us to bound the number of edge-splitting iterations by $r - 1$ for the whole
process.

Thus, we have the following lemma. 


Lemma 9 There is an algorithm that finds an edge in a non-empty hypergraph using expected
O((log m + r) log n) edge-detecting queries. Moreover, the queries can be made in expected
2(log m + r) rounds.


Since the algorithm terminates in expected log m + r iterations, according to Markov's Inequality,
with probability at least 1/2, the algorithm terminates in 2(log m + r) iterations. We convert
it to one that succeeds with high probability by running log(1/δ) copies, each of which has its own
independent random choices. All copies are synchronized at each iteration and the algorithm ends
when one of them succeeds. We will refer to this algorithm as PARA-FIND-ONE-EDGE.
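
The amplification step can be checked with a short calculation. Here T denotes the number of iterations of a single copy and k the number of copies; both symbols are introduced only for this illustration, and the copies are assumed to be independent, as stated above.

```latex
\Pr\bigl[T > 2(\log m + r)\bigr] \;\le\; \frac{\mathbb{E}[T]}{2(\log m + r)} \;\le\; \frac{1}{2},
\qquad
\Pr\bigl[\text{all } k \text{ copies need more than } 2(\log m + r) \text{ iterations}\bigr] \;\le\; 2^{-k}.
```

Taking $k = \log_2(1/\delta)$ makes the second bound equal to δ, and since each iteration uses at most 2 rounds, 2(log m + r) iterations correspond to 4(log m + r) rounds.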


Corollary 10 With probability at least 1 − δ, PARA-FIND-ONE-EDGE finds an edge using
O((log m + r) log n log(1/δ)) edge-detecting queries, and the queries can be made in 4(log m + r) rounds.


5. A Linear-Query Algorithm 


Reconstructing an independent covering family at the discovery of every new edge is indeed wasteful.
In this section we show how to modify the quadratic algorithm to obtain an algorithm using
a number of queries that is linear in the number of edges. Our algorithm is optimal in terms of
the dependence on m. Moreover, the queries can be made in $O(\min(2^r(\log m + r)^2, (\log m + r)^3))$
rounds.

Before we begin to describe our algorithm, we introduce some notation and make some definitions.
First we reduce the discovery probabilities. Let


$$p_H(\chi) = 1/\left(2^{r+|\chi|+2}\, d_H(\chi)\right)^{1/(r-|\chi|)},$$


where $\chi$ is a relevant vertex set. Let the best discovery probability of $\chi$ be the minimum discovery
probability among all its subsets. That is,


$$p_H^*(\chi) = \min_{\chi' \subseteq \chi} p_H(\chi').$$

Definition 11 Let $p_\chi(p)$ be the probability that a $(\chi, p)$-sample is positive, where $\chi$ is a relevant
vertex set.


Remark 12 $p_\chi(p)$ is continuous and monotonically increasing.
Angluin and Chen (2004) contains a proof of this fact.


Definition 13 Let $p_\chi = \min\{p \mid p_\chi(p) = 1/2^{r+1}\}$ be the threshold probability of a relevant vertex
set $\chi$.


Remark 14 Due to the fact that $p_\chi(0) = 0$, $p_\chi(1) = 1$ and that $p_\chi(p)$ is continuous and monotonically
increasing, the threshold probability uniquely exists.
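
As an illustration of the threshold probability, the sketch below computes $p_\chi(p)$ exactly for a small target hypergraph and locates the threshold by bisection, which is justified by continuity and monotonicity. It assumes that a $(\chi, p)$-sample contains all of $\chi$ and every other vertex independently with probability p; the helper names are hypothetical.

```python
from itertools import combinations


def positive_probability(edges, chi, p):
    """Exact Pr[a (chi, p)-sample contains at least one edge], by inclusion-exclusion."""
    edges = list(edges)
    total = 0.0
    for k in range(1, len(edges) + 1):
        for subset in combinations(edges, k):
            union = frozenset().union(*subset)
            extra = len(union - chi)  # vertices that must be drawn at random
            total += (-1) ** (k + 1) * p ** extra
    return total


def threshold_probability(edges, chi, r, tol=1e-12):
    """Smallest p with positive_probability(p) = 1/2^(r+1), found by bisection."""
    target = 1.0 / 2 ** (r + 1)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if positive_probability(edges, chi, mid) >= target:
            hi = mid
        else:
            lo = mid
    return hi


# Target hypergraph: two 3-edges sharing vertex 1, so p_chi(p) = 2p^2 - p^4 for chi = {1}.
target_edges = [frozenset({1, 2, 3}), frozenset({1, 4, 5})]
chi = frozenset({1})
p_chi = threshold_probability(target_edges, chi, r=3)
```

The inclusion-exclusion sum is exponential in the number of edges, so this is only meant for toy instances; the paper uses threshold probabilities in the analysis only.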


Note that both threshold probabilities and discovery probabilities reflect the degree of set $\chi$
or the degrees of its subsets. The difference is that discovery probabilities reflect degrees in the
hypergraph we have found, while threshold probabilities reflect degrees in the target hypergraph.
Threshold probabilities are only used in analysis.


5.1 Overview Of The Algorithm 


An “obvious” improvement to the quadratic algorithm is that instead of calling FIND-ONE-EDGE 
on one positive sample at each iteration, we can call it on all positive samples. It is plausible that this 
will yield more edges. However, there is no guarantee that different calls to FIND-ONE-EDGE will 
output different edges. For instance, calls to FIND-ONE-EDGE on two sets that share a common 
edge will produce the same edge in the worst case. We use several standard tricks to circumvent 
this obstacle. In fact, the family of samples constructed here is more complex than that used in 
Section 4, so as to ensure with high probability that the algorithm will make a certain amount of 
progress at each iteration. By doing so, we are able to reduce the number of iterations from m to
$O(\min(2^r(\log m + r), (\log m + r)^2))$. The number of queries will also be reduced.

First of all, the sampling probabilities are halved in order to accommodate more edges. More
precisely, imagine that we draw $(\chi, \frac{1}{2}p_H(\chi))$-samples instead of $(\chi, p_H(\chi))$-samples in the quadratic
algorithm. Take a look at a sample drawn several iterations ago, which the quadratic algorithm did
not call FIND-ONE-EDGE on. Such a sample will still have reasonable probability of excluding
all the edges that have been found, as long as the degree of $\chi$ has not been increased by a factor of
$2^{r-|\chi|}$ or, equivalently, the discovery probability of $\chi$ has not been decreased by half.
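
The equivalence between the two conditions follows from the form of the discovery probability. As a sketch, write $p_H(\chi) = (C\, d_H(\chi))^{-1/(r-|\chi|)}$, where C stands for a factor that depends only on r and $|\chi|$ (the symbol C is introduced here for illustration). Multiplying the degree by $2^{r-|\chi|}$ then gives:

```latex
\bigl(C \cdot 2^{\,r-|\chi|}\, d_H(\chi)\bigr)^{-1/(r-|\chi|)}
= 2^{-1}\,\bigl(C\, d_H(\chi)\bigr)^{-1/(r-|\chi|)}
= \tfrac{1}{2}\, p_H(\chi).
```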

Second, the algorithm uses the best discovery probability for each relevant set. We call a relevant 
vertex set minimal if it has the minimum discovery probability among its subsets. In the quadratic 
algorithm, the goal is that one of the samples will produce an edge. According to the proof of 
Lemma 5, in the quadratic algorithm, we actually only need to draw samples for minimal relevant 
sets. In this algorithm, we hope that samples drawn for every relevant set will produce edges. But 
drawing samples for non-minimal relevant sets with discovery probabilities is not sufficient to avoid 
edges we have already found. Therefore, the best discovery probabilities are used. 

Finally, besides samples drawn proportional to degrees, the algorithm also draws samples pro- 
portional to the contribution of each relevant set. The idea is simple. Draw more samples for those 
relevant sets that are more likely to produce a new edge. The algorithm maintains a contribution 
counter $c(\chi)$ for each relevant set $\chi$, which records the number of new edges that $\chi$-samples have
produced. As we have already said, different calls to FIND-ONE-EDGE at each iteration may
output the same edge. As all calls to FIND-ONE-EDGE at each iteration are made in parallel, it 
is not clear which sample each new edge should be attributed to. To solve this problem, calls to 
FIND-ONE-EDGE are processed sequentially in an arbitrary order. 

In the algorithm, $F_H$ consists of two parts: $F_H^1$ and $F_H^2$. In $F_H^1$ the algorithm draws samples
proportional to the contribution of each relevant set. $F_H^2$ is closer to $F_H$ in Section 4. Intuitively, a
high-degree relevant set in the target hypergraph (not necessarily a high-degree relevant set in H),
or a relevant set with small threshold probability is important, because an edge or a non-edge may
not be found if its important relevant subsets are not found. The smaller the threshold probability
of a relevant set, the more important it is. The algorithm uses samples in $F_H^1$ to find edges while
samples in $F_H^2$ are mainly used to cover non-edges of H. $F_H^2$ not only gives a short proof when
H is indeed the target hypergraph, but also finds important relevant sets quickly. The design of
$F_H^2$ guarantees that if the contribution of the most important subset of an edge or a non-edge stops
doubling, a more important relevant subset will be discovered with high probability.


5.2 The Algorithm 


Let H = (V,E) be the hypergraph the algorithm has found so far. δ' is a parameter we will specify
later. The algorithm is shown in Algorithm 6. At each iteration, the algorithm operates in two
phases, the query phase and the computation phase. In the query phase, the algorithm draws random
samples and makes queries. The queries can be made in O(log m + r) rounds, as queries of each call
to PARA-FIND-ONE-EDGE can be made in O(logm + r) rounds. In the computation phase, the 
algorithm processes the query results to update the contribution counter of each relevant set and also 
adds newly found relevant sets. 


Algorithm 6 The linear-query algorithm
All PARA-FIND-ONE-EDGE's are called with parameter δ'.
1: e ← PARA-FIND-ONE-EDGE(V).
2: E ← {e}. c(∅) ← 1.
3: repeat
QUERY PHASE
4: Let $F_H^1$ be a family that for every known relevant set $\chi$ contains $c(\chi) \cdot 2^{r+2} \ln\frac{1}{\delta'}$ $(\chi, \frac{1}{2}p_H^*(\chi))$-samples.
5: Let $F_H^2$ be a family that for every known relevant set $\chi$ contains $2^{3r+3} d_H(\chi) \ln\frac{1}{\delta'}$ $(\chi, \frac{1}{4}p_H^*(\chi))$-samples.
6: Let $F_H = F_H^1 \cup F_H^2$. Make queries on sets in $F_H$ that are independent in H.
7: Call PARA-FIND-ONE-EDGE on all positive samples.
COMPUTATION PHASE
8: For each known relevant set $\chi$, divide $\chi$-samples in $F_H^1$ into $c(\chi)$ groups of size $2^{r+2} \ln\frac{1}{\delta'}$.
9: Process the samples in $F_H^1$ group by group in an arbitrary order. Increase $c(\chi)$ by the number
of new edges that $\chi$-samples produce. Add newly found edges to E.
10: Process the samples in $F_H^2$. Add newly found edges to E.
11: For every newly found relevant set $\chi$, $c(\chi)$ ← 1.
12: until no new edge is found
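
The control flow of Algorithm 6 can be sketched in a few lines of Python. This is a simplified illustration under several assumptions, not the authors' implementation: PARA-FIND-ONE-EDGE is replaced by a naive sequential routine that removes vertices one at a time, the known relevant sets are taken to be the proper subsets of found edges, $d_H(\chi)$ is the number of found edges containing $\chi$, and all names are hypothetical.

```python
import math
import random
from itertools import combinations


def learn_uniform(vertices, query, r, delta_prime=0.1, rng=None):
    """Sketch of Algorithm 6 with a naive stand-in for PARA-FIND-ONE-EDGE."""
    rng = rng or random.Random(0)
    log_term = math.log(1.0 / delta_prime)

    def find_one_edge(s):
        # Stand-in routine: drop each vertex whose removal keeps the set positive.
        # The minimal positive set that remains is an edge.
        s = set(s)
        for v in sorted(s):
            if query(s - {v}):
                s.discard(v)
        return frozenset(s)

    def p_star(edges, chi):
        # Best discovery probability: minimum over subsets x of chi of
        # 1 / (2^(r+|x|+2) * d_H(x))^(1/(r-|x|)).
        best = 1.0
        for k in range(len(chi) + 1):
            for sub in combinations(sorted(chi), k):
                d = sum(1 for e in edges if set(sub) <= e)
                best = min(best, (2 ** (r + k + 2) * d) ** (-1.0 / (r - k)))
        return best

    def sample(chi, p):
        return frozenset(chi) | {v for v in vertices if rng.random() < p}

    edges = {find_one_edge(vertices)}                 # steps 1-2
    c = {frozenset(): 1}
    while True:                                       # step 3
        relevant = {frozenset(s) for e in edges
                    for k in range(r) for s in combinations(sorted(e), k)}
        for chi in relevant:                          # step 11 for new relevant sets
            c.setdefault(chi, 1)
        family = []                                   # steps 4-6: (chi, sample, in_F1)
        for chi in relevant:
            ps = p_star(edges, chi)
            d = sum(1 for e in edges if chi <= e)
            n1 = c[chi] * math.ceil(2 ** (r + 2) * log_term)
            n2 = math.ceil(2 ** (3 * r + 3) * d * log_term)
            family += [(chi, sample(chi, ps / 2), True) for _ in range(n1)]
            family += [(chi, sample(chi, ps / 4), False) for _ in range(n2)]
        independent = [t for t in family if not any(e <= t[1] for e in edges)]
        positive = [t for t in independent if query(t[1])]
        positive.sort(key=lambda t: not t[2])         # process F1 before F2
        new_edges = set()                             # steps 7-10, sequentially
        for chi, s, in_f1 in positive:
            e = find_one_edge(s)
            if e not in new_edges:
                new_edges.add(e)
                if in_f1:
                    c[chi] += 1
        if not new_edges:                             # step 12
            return edges
        edges |= new_edges


hidden = [frozenset(e) for e in ({0, 1}, {1, 2}, {3, 4}, {5, 6})]
oracle = lambda s: any(e <= s for e in hidden)
found = learn_uniform(range(8), oracle, r=2)
```

Every set returned by the stand-in routine is a minimal positive set and therefore an edge of the hidden hypergraph, so the output never contains a non-edge; whether all edges are recovered depends on the random samples, as in the analysis that follows.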





We will show that the algorithm terminates in $O(\min(2^r(\log m + r), (\log m + r)^2))$ iterations with
high probability. Since $\sum_\chi d_H(\chi) \le 2^r m$ and $\sum_\chi c(\chi) \le (2^r + 1)m$ (note that $c(\chi)$ is one more than the
number of new edges that $\chi$-samples in $F_H^1$ produce), the number of queries made at each iteration
is at most $O(2^{4r} m \cdot \mathrm{poly}(r, \log n, \log\frac{1}{\delta'}))$. Therefore, the total number of queries will be linear in the
number of edges with high probability, as desired.
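
The per-iteration bound can be traced through the sample counts in steps 4 and 5 of Algorithm 6. As a sketch, summing the sizes of the two families over all known relevant sets and using the two inequalities above gives:

```latex
\sum_{\chi}\Bigl(c(\chi)\,2^{r+2} + 2^{3r+3}\,d_H(\chi)\Bigr)\ln\frac{1}{\delta'}
\;\le\; \Bigl((2^r+1)\,m\,2^{r+2} + 2^{3r+3}\cdot 2^r m\Bigr)\ln\frac{1}{\delta'}
\;=\; O\!\Bigl(2^{4r}\, m \ln\frac{1}{\delta'}\Bigr).
```

Each positive sample additionally triggers one call to PARA-FIND-ONE-EDGE, whose cost from Corollary 10 accounts for the remaining polynomial factor in r, log n and log(1/δ').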


5.3 Analysis 


Consider some iteration of the algorithm. Let H be the hypergraph the algorithm has found at the
beginning of the iteration. Let e be an edge that has not yet been found. Let $\chi$ be a known subset of
e. $\chi$ can be either active, in which case a $\chi$-sample is likely to contain an edge, or inactive otherwise.
Formally speaking,


Definition 15 We say that $\chi$ is active if $p_\chi(\frac{1}{2}p_H^*(\chi)) \ge 1/2^{r+1}$ or, equivalently, $\frac{1}{2}p_H^*(\chi) \ge p_\chi$, and
inactive otherwise.


The following two assertions serve as the goals for each iteration. 


Assertion 16 Consider one group of $\chi$-samples G in $F_H^1$. Let $H'$ be the hypergraph the algorithm
has found before this group is processed. If $\chi$ is active, either $p_{H'}^*(\chi) < \frac{1}{2}p_H^*(\chi)$ or G will produce
a new edge.


Assertion 17 If $\chi$ is inactive, then at the end of this iteration, either e has been found or a subset
of e whose threshold probability is at most $\frac{1}{2}p_\chi$ has been found (a relevant subset is found when an
edge that contains it is found).


The following two lemmas show that both assertions hold with high probability.


Lemma 18 Assertion 16 is true with probability at least 1 − δ'.


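Proof If $p_{H'}^*(\chi) \ge \frac{1}{2}p_H^*(\chi)$, the probability that a $\chi$-sample contains an edge in $H'$ is at most


$$\sum_{\chi' \subseteq \chi} d_{H'}(\chi')\left(\tfrac{1}{2}p_H^*(\chi)\right)^{r-|\chi'|} \le \sum_{\chi' \subseteq \chi} d_{H'}(\chi')\, p_{H'}(\chi')^{r-|\chi'|} \le \frac{2^{|\chi|}}{2^{r+|\chi|+2}} = \frac{1}{2^{r+2}}.$$


On the other hand, since $\chi$ is active, we have $p_\chi(\frac{1}{2}p_H^*(\chi)) \ge 1/2^{r+1}$. That is, with probability at least
$1/2^{r+1}$ a $\chi$-sample will contain an edge. Therefore the probability that a $\chi$-sample contains a new
edge is at least $1/2^{r+1} - 1/2^{r+2} = 1/2^{r+2}$. Recall that G contains $2^{r+2}\ln\frac{1}{\delta'}$ $\chi$-samples. Therefore,
with probability at least 1 − δ' there exists at least one sample in G that will produce a new edge. ∎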
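
The last step is the usual bound on repeated trials. As a sketch, treating the samples in G as independent trials that each produce a new edge with probability at least $1/2^{r+2}$, the probability that none of the $2^{r+2}\ln\frac{1}{\delta'}$ samples succeeds is at most:

```latex
\Bigl(1 - \frac{1}{2^{r+2}}\Bigr)^{2^{r+2}\ln\frac{1}{\delta'}}
\;\le\; \exp\Bigl(-\frac{1}{2^{r+2}}\cdot 2^{r+2}\ln\frac{1}{\delta'}\Bigr)
\;=\; \delta'.
```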


Lemma 19 Assertion 17 is true with probability at least 1 − δ'.


Proof Let $\chi^* \subseteq \chi$ have the minimum discovery probability among all subsets of $\chi$ at the beginning
of the iteration. Thus, $p_H(\chi^*) = p_H^*(\chi)$ by the definition. Let us consider a $\chi^*$-sample $P_{\chi^*}$ in $F_H^2$.
Let A be the collection of all subsets of e whose threshold probabilities are not less than $\frac{1}{2}p_\chi$. We
do not want $P_{\chi^*}$ to contain any edge that contains $\chi'$ for any $\chi' \in A$ because they prevent us from
discovering relevant sets with low threshold probabilities ($< \frac{1}{2}p_\chi$).

We observe that $\frac{1}{2}p_H(\chi^*) = \frac{1}{2}p_H^*(\chi) < p_\chi$ because $\chi$ is inactive. Thus, we have that $\forall \chi' \in A$,


$$p_{\chi'}\left(\tfrac{1}{4}p_H(\chi^*)\right) < p_{\chi'}\left(\tfrac{1}{2}p_\chi\right) \le p_{\chi'}(p_{\chi'}) = 1/2^{r+1}.$$


Therefore,


$$\Pr[\exists \text{ an edge } e' \subseteq P_{\chi^*},\ e' \cap e \in A \mid e \subseteq P_{\chi^*}] \le \sum_{\chi' \in A} p_{\chi'}\left(\tfrac{1}{4}p_H(\chi^*)\right) \le 1/2.$$


Combining with the fact that


$$\Pr[e \subseteq P_{\chi^*}] = \left(\tfrac{1}{4}p_H(\chi^*)\right)^{r-|\chi^*|} = \frac{1}{2^{r+|\chi^*|+2+2r-2|\chi^*|}\, d_H(\chi^*)} \ge \frac{1}{2^{3r+2}\, d_H(\chi^*)},$$


we have that with probability at least $1/(2^{3r+3} d_H(\chi^*))$, $P_{\chi^*}$ contains e but does not contain any edge
whose intersection with e is in A, in which case PARA-FIND-ONE-EDGE($P_{\chi^*}$) either outputs e or
outputs an edge whose intersection with e has threshold probability at most $\frac{1}{2}p_\chi$. The probability
that such a $P_{\chi^*}$ exists in $F_H^2$ is at least 1 − δ', as we draw at least $2^{3r+3} d_H(\chi^*) \ln\frac{1}{\delta'}$ $(\chi^*, \frac{1}{4}p_H(\chi^*))$-samples. ∎


Let $H'$ be the hypergraph that has been found at the end of the iteration. Let $c_H(\chi)$ and $c_{H'}(\chi)$
be the values of $c(\chi)$ at the beginning and end of the iteration respectively. At each iteration, if no
assertion is violated, one of the following two events happens.


1. Either $c_{H'}(\chi) \ge 2c_H(\chi)$ or $p_{H'}^*(\chi) < \frac{1}{2}p_H^*(\chi)$. ($c(\chi)$ doubles when each of the $c_H(\chi)$ groups
of $\chi$-samples in $F_H^1$ succeeds in producing a new edge.)


2. Either e has been found or a subset of e whose threshold probability is at most $\frac{1}{2}p_\chi$ has been
found.


That is, the two assertions guarantee that the algorithm makes definite progress at each iteration.
The following lemma gives a bound on the number of iterations of the algorithm.


Lemma 20 Assuming no assertion is violated, the algorithm terminates in
$O(\min(2^r(\log m + r), (\log m + r)^2))$ iterations.


Proof First we remark that the minimum and maximum possible values for both discovery probabilities
and threshold probabilities are $1/(2^{2r+1}m)$ and 1/2 respectively, and the minimum and
maximum possible values for $c(\chi)$ are 1 and m + 1.

For each edge e, we divide the iterations into phases until e is found. Each phase is associated
with a known relevant subset $\chi$ of e which has the minimum threshold probability at the beginning
of the phase. A $\chi$-phase ends when $\chi$ becomes inactive and then either e will be found or another
relevant subset of e with at most half of $\chi$'s threshold probability will be found. Let us associate
$\chi$'s threshold probability with a $\chi$-phase. There are certainly at most $2^r$ phases because this is a
bound on the number of subsets of e. Moreover, there are at most O(log m + r) phases as the associated
threshold probability halves at the end of each phase. Furthermore, each phase takes at most
O(log m + r) iterations, since either $c(\chi)$ doubles or the best discovery probability halves at each
iteration. Therefore the algorithm terminates in $O(\min(2^r(\log m + r), (\log m + r)^2))$ iterations. ∎
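
The two O(log m + r) counts in the proof come from repeated halving and doubling between fixed extremes. As a sketch, taking the extreme values quoted at the start of the proof to be $1/(2^{2r+1}m)$ and 1/2 for the probabilities, and 1 and m + 1 for the counter:

```latex
\#\text{phases} \;\le\; 1 + \log_2\frac{1/2}{1/(2^{2r+1}m)} \;=\; 1 + 2r + \log_2 m \;=\; O(\log m + r),
\qquad
\#\text{iterations per phase} \;\le\; \underbrace{\log_2(m+1)}_{c(\chi)\ \text{doubles}} + \underbrace{2r + \log_2 m}_{p_H^*(\chi)\ \text{halves}} \;=\; O(\log m + r).
```

Multiplying the per-phase bound by either of the two bounds on the number of phases ($2^r$ or O(log m + r)) gives the minimum stated in Lemma 20.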


It is not hard to see that the total number of assertions we need to satisfy before the algorithm
succeeds is bounded by $\mathrm{poly}(2^r, m)$, including the assertions that each PARA-FIND-ONE-EDGE
will succeed. Choose $\delta' = \Theta(\delta/\mathrm{poly}(2^r, m))$ and the algorithm will succeed with probability at
least 1 − δ. Although the choice of δ' requires knowledge of m, it is sufficient to use an upper bound
of $\binom{n}{r}$, and we have that $\log\frac{1}{\delta'} \le \mathrm{poly}(r, \log n) \cdot \log\frac{1}{\delta}$. Since queries at each iteration are made in
O(log m + r) rounds, it follows that

Theorem 21 With probability at least 1 − δ, Algorithm 6 learns an r-uniform hypergraph with m
edges and n vertices, using $O(2^{4r} m \cdot \mathrm{poly}(r, \log n, \log\frac{1}{\delta}))$ queries, in
$O(\min(2^r(\log m + r)^2, (\log m + r)^3))$ rounds.


6. Lower Bounds For Almost Uniform Hypergraphs 


In this section, we derive a lower bound for the class of (r, Δ)-uniform hypergraphs. The following
theorem is proved in Angluin and Chen (2004).


Theorem 22 $\Omega((2m/r)^{r/2})$ edge-detecting queries are required to identify a hypergraph drawn
from the class of all (r, r − 2)-uniform hypergraphs with n vertices and m edges.


We show that by a simple reduction this gives us a lower bound for general (r, Δ)-uniform hypergraphs.

Theorem 23 $\Omega((2m/(\Delta+2))^{1+\Delta/2})$ edge-detecting queries are required to identify a hypergraph
drawn from the class of all (r, Δ)-uniform hypergraphs with n vertices and m edges.


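Proof Given a (Δ + 2, Δ)-uniform hypergraph H = (V, E), let H' = (V ∪ V', E') be an (r, Δ)-uniform
hypergraph, where |V'| = r − Δ − 2, V' ∩ V = ∅ and E' = {e ∪ V' | e ∈ E}. A query on a set S to H' is
answered 0 if S does not contain V', and otherwise has the same answer as the query on S ∩ V to H.
Hence any algorithm that learns H' can be converted to learn H with the same number of queries. ∎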
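
The reduction can be illustrated with a small sketch that answers edge-detecting queries about the padded hypergraph H' using one query to H each. The function names and the toy instance are hypothetical and chosen only for this example.

```python
from itertools import combinations


def make_oracle(edges):
    """Edge-detecting oracle: Q(S) is true iff S contains every vertex of some edge."""
    edges = [frozenset(e) for e in edges]
    return lambda s: any(e <= frozenset(s) for e in edges)


def lift_oracle(q_h, v_prime):
    """Answer queries about H' = (V + V', {e + V' : e in E}) via queries to H."""
    v_prime = frozenset(v_prime)

    def q_h_prime(s):
        s = frozenset(s)
        if not v_prime <= s:
            return False           # every edge of H' contains all of V'
        return q_h(s - v_prime)    # otherwise only the part inside V matters

    return q_h_prime


# A (Delta+2, Delta)-uniform H with Delta = 1, padded up to dimension r = 5.
E = [{0, 1}, {1, 2, 3}]
V, V_prime = range(4), {"a", "b"}   # |V'| = r - Delta - 2 = 2
simulated = lift_oracle(make_oracle(E), V_prime)
direct = make_oracle([set(e) | V_prime for e in E])

# Compare the simulated oracle with the true oracle of H' on every subset.
universe = list(V) + sorted(V_prime)
agree = all(
    simulated(s) == direct(s)
    for k in range(len(universe) + 1)
    for s in combinations(universe, k)
)
```

Because each simulated query costs at most one query to H, a learner for H' run against `simulated` identifies H within the same query budget.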


7. Learning Almost Uniform Hypergraphs 


In this section, we extend our results to learning (r, Δ)-uniform hypergraphs. The query upper bound
stated in the following theorem matches the lower bound of Theorem 23 in terms of dependence on
m. The round upper bound is only 1 + Δ times more than that of Algorithm 6.


Theorem 24 There is a randomized algorithm that learns an (r, Δ)-uniform hypergraph with m
edges and n vertices with probability at least 1 − δ, using $O(2^{O((1+\frac{\Delta}{2})r)} \cdot m^{1+\frac{\Delta}{2}} \cdot \mathrm{poly}(\log n, \log\frac{1}{\delta}))$
queries. Furthermore, the queries can be made in $O((1 + \Delta) \cdot \min(2^r(\log m + r)^2, (\log m + r)^3))$
rounds.


7.1 The Algorithm 


One of the main modifications is the use of new discovery probabilities. We first provide some
intuition for the new discovery probabilities. We have been choosing the discovery probability for
a relevant set $\chi$ to be inversely proportional to the $(r - |\chi|)$-th root of its degree. It is so chosen that a
$\chi$-sample has a good chance of excluding edges that contain $\chi$. In an almost uniform hypergraph, we
choose the discovery probabilities for the same purpose. In other words, we would like to choose p
such that $\sum_{e \in E,\, e \supseteq \chi} p^{|e \setminus \chi|} \le 1/2^{r+2}$. Similarly, we should set p to be inversely proportional to the w-th
root of $d_H(\chi)$, where $w = \min_{e \supseteq \chi} |e \setminus \chi|$ is the minimum difference in cardinalities between edges
containing $\chi$ and $\chi$. However, w is no longer equal to $r - |\chi|$ as in uniform hypergraphs. There are
two cases. When $|\chi| < r - \Delta$, we have $w \ge r - \Delta - |\chi|$ because the minimum edge size is $r - \Delta$;
when $|\chi| \ge r - \Delta$, w can be as small as 1.

The case when w = 1 is special, as it implies that there exists an edge e such that $|e \setminus \chi| = 1$, that is, e
has only one vertex v that $\chi$ does not have. We will call e a 1-edge of $\chi$. On one hand, any $\chi$-sample
containing v contains e, and hence is not an independent set; on the other hand, by excluding every
vertex whose union with $\chi$ contains an edge, we can easily exclude all corresponding edges. Thus
we remove these vertices from each $\chi$-sample and the resulting sample, which we call a modified
$\chi$-sample, is an improvement over the original one. (We remark that this improvement is available
for the uniform hypergraph problem in the case when $|\chi| = r - 1$, but is not as important.) More
specifically, let $\nu_H(\chi)$ be the set of all vertices v such that $\chi \cup \{v\}$ contains an edge in H. A modified
$(\chi, p)$-sample is a $(\chi, \nu_H(\chi), p)$-sample defined as follows.


Definition 25 A $(\chi, \nu, p)$-sample ($\chi \cap \nu = \emptyset$) is a random set of vertices that contains $\chi$, does
not contain any vertex in $\nu$, and contains each other vertex independently with probability p.
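
Drawing such a sample is straightforward; the following sketch is one way to do it, with hypothetical names and Python's standard random number generator standing in for the algorithm's random choices.

```python
import random


def draw_sample(vertices, chi, nu, p, rng=random):
    """Draw a (chi, nu, p)-sample: all of chi, nothing from nu, others with prob. p."""
    chi, nu = frozenset(chi), frozenset(nu)
    if chi & nu:
        raise ValueError("chi and nu must be disjoint")
    others = (v for v in vertices if v not in chi and v not in nu)
    return chi | {v for v in others if rng.random() < p}


# Vertex 0 is forced in, vertices 1 and 2 are forced out,
# and vertices 3, 4, 5 are each included with probability 0.5.
example = draw_sample(range(6), chi={0}, nu={1, 2}, p=0.5, rng=random.Random(1))
```

Passing an explicit `random.Random` instance keeps the draw reproducible, which is convenient when several copies of a routine must make independent random choices.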


Algorithm 7 Learning an (r, Δ)-uniform hypergraph
All PARA-FIND-ONE-EDGE's are called with parameter δ'.
1: e ← PARA-FIND-ONE-EDGE(V).
2: E ← {e}. c(∅) ← 1.
3: repeat
QUERY PHASE
4: Let $F_H^1$ be a family that for every known relevant set $\chi$ contains $c(\chi) \cdot 2^{r+2} \ln\frac{1}{\delta'}$ modified
$(\chi, \frac{1}{2}p_H^*(\chi))$-samples and the same number of modified $(\chi, \frac{1}{2^{r+3+|\chi|} d_H(\chi)})$-samples.
5: Let $F_H^2$ be a family that for every known relevant set $\chi$ contains $2(4/p_H^*(\chi))^{r-|\chi|} \ln\frac{1}{\delta'}$ modified
$(\chi, \frac{1}{4}p_H^*(\chi))$-samples and $2^{r+4+|\chi|} d_H(\chi) \ln\frac{1}{\delta'}$ modified $(\chi, \frac{1}{2^{r+3+|\chi|} d_H(\chi)})$-samples.
6: Let $F_H = F_H^1 \cup F_H^2$. Make queries on sets in $F_H$ that are independent in H.
7: Call PARA-FIND-ONE-EDGE on all positive samples.
COMPUTATION PHASE
8: For each relevant set $\chi$, divide $\chi$-samples in $F_H^1$ into $c(\chi)$ groups of $2^{r+2} \ln\frac{1}{\delta'}$ modified
$(\chi, \frac{1}{2}p_H^*(\chi))$-samples and the same number of modified $(\chi, \frac{1}{2^{r+3+|\chi|} d_H(\chi)})$-samples.
9: Process the samples in $F_H^1$ group by group in an arbitrary order. Increase $c(\chi)$ by the number
of new edges that $\chi$-samples produce. Add newly found edges to E.
10: Process the samples in $F_H^2$. Add newly found edges to E.
11: 1-edge-finder: For any $\chi$-sample $P_\chi \in F_H$, let e be the output of PARA-FIND-ONE-EDGE($P_\chi$).
∀v ∈ e, make a query on $\chi \cup \{v\}$ to test whether it is an edge. Add newly
found edges to E.
12: For every newly found relevant set $\chi$, $c(\chi)$ ← 1.
13: until no new edge is found





We remark that we can use original $\chi$-samples and obtain a much simpler algorithm than Algorithm 7.
However, the query complexity will be roughly $m^{\Delta+2}$ instead of $m^{1+\frac{\Delta}{2}}$. The reduction of
the complexity in the exponent of m is due to the fact that each modified $\chi$-sample only needs to
deal with edges that have at least 2 vertices that $\chi$ does not have. This leads to the definition of the
new discovery probability as follows.


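$$p_H(\chi) = \begin{cases} 1/\left(2^{r+|\chi|+2}\, d_H(\chi)\right)^{1/(r-\Delta-|\chi|)}, & \text{if } |\chi| \le r-\Delta-2, \\ 1/\left(2^{r+|\chi|+2}\, d_H(\chi)\right)^{1/2}, & \text{otherwise.} \end{cases}$$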
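
A small sketch of this piecewise definition follows. It assumes that both cases share the base $2^{r+|\chi|+2} d_H(\chi)$ used in Section 5 and that the two cases split at $|\chi| \le r - \Delta - 2$, as in the query-complexity analysis later in this section; the function name and argument order are hypothetical.

```python
def new_discovery_probability(d, size, r, delta):
    """Discovery probability of a relevant set with degree d and cardinality size.

    Assumed form: 1 / (2^(r+size+2) * d)^(1/w), where w = r - delta - size
    when size <= r - delta - 2, and w = 2 otherwise.
    """
    base = 2 ** (r + size + 2) * d
    if size <= r - delta - 2:
        return base ** (-1.0 / (r - delta - size))
    return base ** -0.5


# With r = 5 and Delta = 1 the two cases meet at size 2, where both use a square root.
at_boundary = new_discovery_probability(d=1, size=2, r=5, delta=1)
small_set = new_discovery_probability(d=1, size=1, r=5, delta=1)   # cube root case
large_set = new_discovery_probability(d=4, size=4, r=5, delta=1)   # square root case
```

The exponent never drops below 1/2, which reflects the fact that a modified sample only has to avoid edges with at least 2 vertices outside the relevant set.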


We use the new discovery probabilities in Algorithm 7. Although we use modified samples, 
special care is still needed for 1-edges in order to parallelize the edge finding process. In fact, the 
majority of effort in developing Algorithm 7 is devoted to dealing with 1-edges. 

In both $F_H^1$ and $F_H^2$ we draw $(\chi, \frac{1}{2^{r+3+|\chi|} d_H(\chi)})$-samples in addition. The reason for the design will
be made clear in the analysis section. A group of $\chi$-samples in $F_H^1$ will consist of both $(\chi, \frac{1}{2}p_H^*(\chi))$-samples
and $(\chi, \frac{1}{2^{r+3+|\chi|} d_H(\chi)})$-samples. $F_H^2$ contains $(\chi, \frac{1}{4}p_H^*(\chi))$-samples as in Algorithm 6.
Although the number of $(\chi, \frac{1}{4}p_H^*(\chi))$-samples appears to be different from that of Algorithm 6, we
remark that $2(4/p_H^*(\chi))^{r-|\chi|} \ln\frac{1}{\delta'}$ is bounded by $2^{3r+3} d_H(\chi) \ln\frac{1}{\delta'}$ under the definition of discovery
probabilities in Section 5, and this group of samples is designed for essentially the same purpose
as those for Algorithm 6. We also use a subroutine called 1-edge-finder, specified in Algorithm 7.


7.2 Analysis 


Round complexity 

The following two definitions are analogous to those in Section 5. The extra subscript indi- 
cates that the new definitions depend on the already found sub-hypergraph H, while the previous 
definitions don’t. 


Definition 26 Let $p_{\chi,H}(p)$ be the probability that a $(\chi, \nu_H(\chi), p)$-sample is positive, where $\chi$ is a
vertex set that does not contain an edge.


Definition 27 Let $p_{\chi,H} = \min\{p \mid p_{\chi,H}(p) = 1/2^{r+1}\}$ be the threshold probability of $\chi$.


Now we bound the number of iterations of Algorithm 7. We divide the process of the algorithm
into (1 + Δ) phases, each of which is indexed by a number in [r − Δ, r]. Phase l begins when all
edges of size less than l have been found. Phase r − Δ is the first phase because there is no edge of
size less than r − Δ.

Let e be an edge of size l and $\chi$ be a known relevant subset of e. We need to deal with two cases:
$|\chi| = l - 1$ and $|\chi| \le l - 2$, the latter of which is simpler as every 1-edge of $\chi$ has been discovered.
We make the following definition.


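Definition 28 $\chi$ is active if it satisfies either of the following two conditions.
1. $|\chi| \le l - 2$ and $p_{\chi,H}(\frac{1}{2}p_H^*(\chi)) \ge 1/2^{r+1}$.
2. $|\chi| = l - 1$ and $p_{\chi,H}(\frac{1}{2}p_H^*(\chi)) \ge 1/2^{r+1}$ and $p_{\chi,H}(\frac{1}{2^{r+3+|\chi|} d_H(\chi)}) \ge 1/2^{r+1}$.


It is inactive otherwise.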





The definition is analogous to that in Section 5, and so are the following assertions. The assertions
are made at phase l.


Assertion 29 Consider one group of $\chi$-samples G in $F_H^1$. Let $H'$ be the hypergraph the algorithm
has found before the group is processed. If $\chi$ is active, one of the following three events happens.
1. $p_{H'}^*(\chi) < \frac{1}{2}p_H^*(\chi)$,
2. $d_{H'}(\chi) > 2d_H(\chi)$, or
3. G will produce a new edge.


Assertion 30 If $\chi$ is inactive, at the end of this iteration, e has been found or a subset of e whose
threshold probability is at most $\frac{1}{2}p_{\chi,H}$ has been found.


The two assertions guarantee that Algorithm 7 makes a certain amount of progress at each iteration.


Lemma 31 If no assertion is violated, phase l terminates in $O(\min(2^r(\log m + r), (\log m + r)^2))$
iterations.


Proof We need to prove that every edge of size l can be found in the specified number of iterations. Let
e be an edge of size l. The proof proceeds similarly to that of Lemma 20. We divide the iterations
into sub-phases, each of which is associated with a subset of e (we use sub-phases here to avoid
confusion). Using an argument similar to that used in the proof of Lemma 20, we can show that
each sub-phase takes O(log m + r) iterations. The only exception is that in this proof, the threshold
probability of a set $\chi$ might not be fixed (it depends on the already found sub-hypergraph H). When
more 1-edges of $\chi$ are found, $p_{\chi,H}(p)$ will decrease as more vertices are excluded from the sample.
Therefore, $p_{\chi,H}$ might increase. After such a sub-phase, the associated threshold probability might
not halve. However, this exception only happens when the subset associated with the sub-phase is
of size l − 1, and it happens at most l ≤ r times, as there are at most l such subsets; it therefore causes
at most l additional sub-phases. Therefore, we get the same asymptotic bound on the number of sub-phases,
which is $O(\min(2^r, \log m + r))$. This establishes the lemma. ∎


Now we show the two assertions are true with high probability.


Lemma 32 Assertion 29 is true for Algorithm 7 with probability at least 1 − δ'.


Proof G consists of two subgroups of samples with different sampling probabilities. In the analysis
we will only consider one subgroup. In the case that $|\chi| \le l - 2$, we use only $(\chi, \frac{1}{2}p_H^*(\chi))$-samples.
In the case that $|\chi| = l - 1$, we will use the subgroup with the smaller sampling probability. Let $\eta$
be the sampling probability of the subgroup we consider. We have $\eta = \frac{1}{2}p_H^*(\chi)$ when $|\chi| \le l - 2$
and $\eta = \min(\frac{1}{2}p_H^*(\chi), \frac{1}{2^{r+3+|\chi|} d_H(\chi)})$ when $|\chi| = l - 1$. By our definition of active, in both cases
$p_{\chi,H}(\eta) \ge 1/2^{r+1}$. The probability that a modified $(\chi, \eta)$-sample contains an edge in $H'$ is at most


$$\sum_{\chi' \subseteq \chi} d_{H'}(\chi')\, \eta^{\max(r-\Delta-|\chi'|,\, 2)} + |\nu_{H'}(\chi) \setminus \nu_H(\chi)| \cdot \eta. \qquad (1)$$


• When $|\chi| \le l - 2$, $|\nu_{H'}(\chi) \setminus \nu_H(\chi)| = 0$. Therefore, if $\eta = \frac{1}{2}p_H^*(\chi) \le p_{H'}^*(\chi)$, Equation (1) is
at most $1/2^{r+2}$.


• When $|\chi| = l - 1$, since every 1-edge of $\chi$ must contain $\chi$ in phase l, Equation (1) is bounded
by
$$\sum_{\chi' \subseteq \chi} d_{H'}(\chi')\, \eta^{\max(r-\Delta-|\chi'|,\, 2)} + d_{H'}(\chi) \cdot \eta.$$
If $\frac{1}{2}p_H^*(\chi) \le p_{H'}^*(\chi)$ and $d_{H'}(\chi) \le 2d_H(\chi)$, the above is bounded by $1/2^{r+2}$.

With probability at least $1/2^{r+2}$, a $(\chi, \eta)$-sample contains an edge that is not contained in $H'$.
Thus, with probability at least 1 − δ', G will produce a new edge. ∎


Lemma 33 Assertion 30 is true for Algorithm 7 with probability at least 1 − δ'.


Proof First we remark that if e has not been found, the probability that e is contained in a modified
$\chi$-sample is the same as for an unmodified one. This is because e does not contain any vertex in
$\nu_H(\chi)$. Otherwise, e contains an edge in H, which violates our assumption that edges do not contain
each other.

If $p_{\chi,H}(\frac{1}{2}p_H^*(\chi)) < 1/2^{r+1}$, the proof proceeds similarly to that of Lemma 19. We remark that
the differences are that $p_{\chi,H}(\cdot)$ and $p_{\chi,H}$ are used instead of $p_\chi(\cdot)$ and $p_\chi$ and we draw more samples in
$F_H^2$ in Algorithm 7.

The remaining case is when $|\chi| = l - 1$ and $p_{\chi,H}(\frac{1}{2^{r+3+|\chi|} d_H(\chi)}) < 1/2^{r+1}$. Consider a
$(\chi, \frac{1}{2^{r+3+|\chi|} d_H(\chi)})$-sample $P_\chi$ in $F_H^2$. Since e is of size l, we have $|e \setminus \chi| = 1$. Let $\{v\} = e \setminus \chi$. We
have that


$$\Pr[v \in P_\chi] = \frac{1}{2^{r+3+|\chi|} d_H(\chi)}$$


and


$$\Pr[\exists \text{ an edge } e' \subseteq P_\chi \text{ such that } v \notin e' \mid v \in P_\chi] \le p_{\chi,H}\left(\frac{1}{2^{r+3+|\chi|} d_H(\chi)}\right) < 1/2^{r+1}.$$


Therefore, with probability at least $\frac{1}{2^{r+3+|\chi|} d_H(\chi)}(1 - 1/2^{r+1})$, $P_\chi$ contains v and contains only
edges that are incident with v. Our 1-edge-finder will find e in this case. As we draw $2^{r+4+|\chi|} d_H(\chi) \ln\frac{1}{\delta'}$
samples, e will be found with probability at least 1 − δ'. ∎


Since the algorithm has only 1 + Δ phases, the algorithm ends after
$O((1 + \Delta) \cdot \min(2^r(\log m + r), (\log m + r)^2))$ iterations. If no assertion is violated, the round complexity of Algorithm 7 is


$$O((1 + \Delta) \cdot \min(2^r(\log m + r)^2, (\log m + r)^3)).$$


We can choose δ' so that the algorithm succeeds with probability 1 − δ and
$\log\frac{1}{\delta'} \le \mathrm{poly}(r, \log n) \cdot \log\frac{1}{\delta}$.


Query complexity

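The main discrepancy of the performance of this algorithm is due to the fact that in $F_H^2$, the
discovery probabilities are chosen as if all the edges were of minimum possible size, while the
numbers of samples drawn are chosen as if all the non-edges (or potential edges) of H were of
the maximum possible size. This causes the super-linear query complexity. At each iteration, the
number of $\chi$-samples in $F_H^2$ is at most


$$2\left(\frac{4}{p_H^*(\chi)}\right)^{r-|\chi|} \ln\frac{1}{\delta'} = \begin{cases} O\!\left(\left(2^{O(r)} \cdot d_H(\chi)\right)^{\frac{r-|\chi|}{r-\Delta-|\chi|}} \cdot \log\frac{1}{\delta'}\right) & \text{if } |\chi| \le r-\Delta-2, \\ O\!\left(\left(2^{O(r)} \cdot d_H(\chi)\right)^{1+\frac{\Delta}{2}} \cdot \log\frac{1}{\delta'}\right) & \text{otherwise.} \end{cases}$$


Note that $(r - |\chi|)/(r - \Delta - |\chi|)$ is at most $1 + \frac{\Delta}{2}$ when $|\chi| \le r - \Delta - 2$. Therefore, the number
of modified $\chi$-samples in $F_H^2$ is at most $O((2^{O(r)} \cdot d_H(\chi))^{1+\frac{\Delta}{2}} \cdot \log\frac{1}{\delta'})$. Because $\sum_\chi d_H(\chi) \le 2^r m$
and $\forall \chi,\ d_H(\chi) \le m$, we have $\sum_\chi d_H(\chi)^{1+\frac{\Delta}{2}} \le (2^r m)^{1+\frac{\Delta}{2}}$. Therefore, the total number of queries the
algorithm makes is bounded by


$$O\!\left(2^{O((1+\frac{\Delta}{2})r)} \cdot m^{1+\frac{\Delta}{2}} \cdot \mathrm{poly}(\log n, \log\tfrac{1}{\delta})\right).$$


This finishes the proof of Theorem 24.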
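
The two inequalities used in the query-complexity argument can be verified directly. The following is a sketch of the arithmetic, using only the facts $\sum_\chi d_H(\chi) \le 2^r m$ and $d_H(\chi) \le m$ quoted above:

```latex
\frac{r-|\chi|}{r-\Delta-|\chi|} \;=\; 1 + \frac{\Delta}{r-\Delta-|\chi|} \;\le\; 1 + \frac{\Delta}{2}
\quad\text{when } r-\Delta-|\chi| \ge 2,
\qquad
\sum_{\chi} d_H(\chi)^{1+\frac{\Delta}{2}}
\;\le\; \Bigl(\max_{\chi} d_H(\chi)\Bigr)^{\frac{\Delta}{2}} \sum_{\chi} d_H(\chi)
\;\le\; m^{\frac{\Delta}{2}} \cdot 2^r m
\;\le\; (2^r m)^{1+\frac{\Delta}{2}}.
```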